## 5 GHz all-digital delay-locked loop for future memory systems beyond double data rate 4 synchronous dynamic random access memory

Dongyeol Lee and Jongsun Kim<sup>✉</sup>

A new low-power, fast-locking, all-digital delay-locked loop (DLL) that uses a disposable time-to-digital converter (TDC) is presented for future memory systems beyond double data rate 4. To achieve fast locking and high-frequency operation, the proposed DLL utilises a new hybrid (TDC + binary + sequential) search algorithm that results in a fast locking time of 11 clock cycles without the false lock and harmonic lock problems. By minimising the intrinsic delay of the digital delay line, the proposed DLL achieves an operating frequency range of 1.5–5.0 GHz, with a maximum frequency higher than that of current state-of-the-art all-digital DLLs. The DLL is fabricated in a 65 nm CMOS process and achieves a peak-to-peak (p-p) output clock jitter of 14 ps (with a p-p input clock jitter of 8 ps) at 5 GHz. The DLL consumes 6.9 mW at 1 V and occupies an active area of 0.025 mm².

Introduction: One of the main challenges in the design of memory interface technologies for next-generation main memories is the implementation of an all-digital delay-locked loop (DLL) that can provide multi-gigahertz (GHz) operation. The double data rate 4 (DDR4) synchronous dynamic random access memory (SDRAM) currently used in top-of-the-line computing systems operates at 1.6 GHz and offers data rates of 3.2 Gbit/s/pin. To achieve faster performance beyond DDR4, next-generation SDRAMs require all-digital DLLs that can operate at 3.2 GHz or higher while maintaining fast locking, low power, small area, and low jitter. Although traditional analogue DLLs are suitable for a wide operating frequency range, they cannot provide fast wake-up from power-down mode. Consequently, DLLs in DDR3 and DDR4 have been implemented digitally. Moreover, post-DDR4 DRAMs will require low standby power consumption as well as fast power-mode transitions to minimise read latency. Traditional digital DLLs [1-5], however, usually have a maximum operating frequency of <3 GHz, which falls short of the frequency requirement of future memory systems beyond DDR4. The maximum frequency of a traditional DLL is limited mainly by the long intrinsic delay of the clock path through the delay line and the long control loop delay of the control logic [4]. Minimising the intrinsic delay of the delay line is complicated in conventional DLLs because the initial delay constraint may cause false lock or harmonic lock problems [1, 4].

This Letter proposes a new all-digital DLL architecture with a short intrinsic delay in the digital delay line and a simple control logic that prevents the false lock and harmonic lock problems. To achieve both high-frequency operation and fast-locking capability while consuming low power, the proposed DLL adopts a new disposable time-to-digital converter (TDC) and utilises a new hybrid (TDC + binary + sequential) search algorithm. The proposed DLL achieves a maximum operation frequency of up to 5 GHz.

Architecture and circuit design: Fig. 1a illustrates the proposed all-digital DLL, which consists of a digitally controlled delay line (DCDL) comprising a fine delay line (FDL) and a coarse delay line (CDL), a timing controller, a TDC, a shift register (SR), an SR controller (SRC), a successive approximation register (SAR), and a phase detector. The clock tree and the replica clock buffer are omitted in this design for simplicity. As shown at the top of Fig. 1b, the DLL has three operating modes: TDC search, binary search, and sequential tracking. First, the TDC-controlled CDL finds a suitable delay in three clock cycles to bring the DLL into coarse lock. Next, the SAR-controlled FDL completes the phase search in eight clock cycles to bring the DLL into fine lock. Finally, the DLL runs a sequential search to track the phase continuously over process, voltage, and temperature (PVT) variations, resulting in reduced dithering jitter.
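The three-mode sequence can be summarised as a simple cycle budget. The following is a minimal sketch that uses only the three-cycle and eight-cycle counts stated above; the function name `dll_mode` and the 0-based cycle indexing after START are hypothetical and are not taken from the Letter.

```python
# Cycle counts are taken from the text; names and indexing are illustrative.
TDC_CYCLES = 3       # delay generation, phase detection, store
BINARY_CYCLES = 8    # SAR-controlled fine search
LOCK_CYCLES = TDC_CYCLES + BINARY_CYCLES  # total locking time in clock cycles


def dll_mode(cycle: int) -> str:
    """Return the operating mode for a clock cycle counted from START (0-based)."""
    if cycle < 0:
        raise ValueError("cycle must be non-negative")
    if cycle < TDC_CYCLES:
        return "tdc_search"
    if cycle < LOCK_CYCLES:
        return "binary_search"
    # After lock, the loop stays closed and tracks PVT drift.
    return "sequential_tracking"
```

Under these assumptions, `LOCK_CYCLES` evaluates to 11, matching the locking time quoted for the proposed DLL.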

Fig. 2 shows the schematic of the proposed CDL and TDC blocks. The CDL consists of 15 coarse delay units (CDUs) with a resolution of $t_{\rm CDU} = t_{\rm cyc}/16$, where $t_{\rm CDU}$ is the propagation delay of the NOR-based CDU and $t_{\rm cyc}$ is the period of CLK<sub>IN</sub>. The TDC consists of 16 *D* flip-flop-based registers synchronised with CLK<sub>IN</sub>. The FDL shown in Fig. 1a adopts a 4 bit inverter-based feedback delay element [6] that utilises positive feedback to achieve a digitally adjustable linear delay with small intrinsic delay. The tunable delay range of the FDL, which is equal to one CDU delay, is controlled by the 4 bit SAR with a delay resolution of $t_{\rm CDU}/2^4$. When the DLL begins operation (START = high), it first performs the TDC search. Initially, the SAR-controlled FDL generates CLK<sub>FD</sub> with its minimum delay and the bypass logic of the TDC selects only the final MUX with C[0] = '1' to generate CLK<sub>OUT</sub> with a minimum DCDL delay.
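The delay budget implied by these resolutions can be written as a short worked equation. This is a sketch under an idealised linear-delay assumption; the symbols $t_{\rm int}$ (intrinsic DCDL delay), $k$ (number of CDUs in the clock path) and $f$ (FDL code) are introduced here for illustration only and are not the Letter's notation.

$$
t_{\rm DCDL} \approx t_{\rm int} + k\,t_{\rm CDU} + f\,\frac{t_{\rm CDU}}{2^4}, \qquad 0 \le k \le 15, \quad 0 \le f \le 15
$$

With $t_{\rm CDU} = t_{\rm cyc}/16$, one FDL step corresponds to

$$
\frac{t_{\rm CDU}}{2^4} = \frac{t_{\rm cyc}}{16 \times 16} = \frac{t_{\rm cyc}}{256},
$$

so the 16 FDL codes together span approximately one CDU delay, and the combined coarse and fine codes address the delay range in 256 steps.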



Fig. 1 Proposed all-digital DLL using disposable TDC

- a Overall architecture
- b Locking process



Fig. 2 Schematic of proposed CDL (top) and TDC (bottom) blocks

Unlike the architecture in [4], where the FDL is located at the end of the CDL, the FDL in this design is located in front of the CDL so that the intrinsic delay of the FDL is included in the calculation of the proper TDC code. This enables the TDC to find the suitable CDL delay more accurately in three clock cycles, as shown in Figs. 1b and 2. In the first cycle (delay generation), the delayed D[14:0] signals are generated by the delay chain of the CDL. The second cycle (phase detection) is used for comparing the phase difference between the reference CLK<sub>IN</sub> and the D[14:0] signals and for generating the TDC[15:0] code signals. The third cycle (store) is used for loading the TDC[15:0] code into the SR, after which the disposable TDC can be turned off to reduce power.
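The phase-detection step can be illustrated with a small behavioural model. This is only a sketch under stated assumptions, not the authors' circuit: the clock is taken as an ideal 50% duty-cycle square wave, each tap adds exactly one CDU delay, and the default values of 180 ps and 40 ps are illustrative choices (consistent with the intrinsic delay and the 2.5 ps × 16 fine range quoted elsewhere in this Letter). All function names are hypothetical.

```python
def sample_delayed_clock(delay_ps: float, period_ps: float) -> int:
    """Sample an ideal 50%-duty clock, delayed by delay_ps, at a reference rising edge."""
    phase = (-delay_ps) % period_ps          # position inside the delayed clock's period
    return 1 if phase < period_ps / 2 else 0


def tdc_code(period_ps: float, t_int_ps: float = 180.0,
             t_cdu_ps: float = 40.0, taps: int = 16) -> list:
    """Sampled bit per tap; tap k is assumed to have a delay of t_int + k * t_cdu."""
    return [sample_delayed_clock(t_int_ps + k * t_cdu_ps, period_ps)
            for k in range(taps)]


def coarse_tap(code: list) -> int:
    """Tap just before the first 1 -> 0 transition (delay just under one period)."""
    for k in range(len(code) - 1):
        if code[k] == 1 and code[k + 1] == 0:
            return k
    # Assumed fallback when no transition is seen: use the longest delay.
    return len(code) - 1
```

For example, `coarse_tap(tdc_code(200.0))` selects tap 0 for a 5 GHz clock, and `coarse_tap(tdc_code(1000 / 3))` selects tap 3 for a 3 GHz clock. Taking the first transition rather than a later one keeps the selected delay below one clock period, which is one plausible way to picture how a TDC-based search sidesteps harmonic lock.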

Finally, the SR generates the CDL control codes, C[15:1], for selecting the correct CDL output, and the TDC search mode is completed with a coarsely locked phase difference of less than one $t_{\rm CDU}$ delay. After the TDC search mode, the DLL enters the binary search mode to initiate fine locking. The variable delay of the FDL is adjusted by the F[3:0] signals generated by the 4 bit SAR, which is synchronised with the divide-by-2 CLK<sub>IN</sub>. In the binary search mode, the quantisation error caused by the delay cell of the CDL is further reduced within eight (= 4 × 2) clock cycles by the 4 bit FDL, resulting in a fine delay resolution of $t_{\rm CDU}/2^4 = 2.5$ ps in this design. After the binary search is completed, the DLL starts sequential tracking by converting the 4 bit SAR into a sequential counter. Moreover, a 4 bit counter operation is activated in the SR by the UP/DN and SHIFT signals generated by the SRC. The SAR and the SR together operate as an 8 bit counter and the DLL maintains a closed loop to track PVT variations continuously. This TDC-based hybrid search algorithm inherently avoids the false lock and harmonic lock problems. By bypassing the CDUs in the CDL at high frequencies, a minimum intrinsic delay of 180 ps is achieved for the DCDL, resulting in a maximum simulated operating frequency of 5.5 GHz.
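The binary search and the subsequent 8 bit tracking can be sketched behaviourally as follows. This is a hedged illustration rather than the authors' logic: the phase detector is reduced to a comparison against a known residual delay, the SR position is modelled as a 4 bit binary index instead of a one-hot shift register, and saturation at the code limits is an assumption. The names `sar_search` and `track_step` are hypothetical.

```python
def sar_search(residual_ps: float, step_ps: float = 2.5, bits: int = 4):
    """Successive approximation from MSB to LSB; returns (fine_code, clock_cycles)."""
    code = 0
    cycles = 0
    for bit in reversed(range(bits)):
        trial = code | (1 << bit)
        # Idealised phase detector: keep the bit if the output is still not late.
        if trial * step_ps <= residual_ps:
            code = trial
        cycles += 2  # the SAR is clocked by the divide-by-2 CLK_IN
    return code, cycles


def track_step(coarse: int, fine: int, up: bool):
    """One tracking step: coarse (upper 4 bits) and fine (lower 4 bits) as one counter."""
    value = (coarse << 4) | fine
    value = min(value + 1, 255) if up else max(value - 1, 0)  # assumed saturation
    return value >> 4, value & 0xF
```

With a 20 ps residual, `sar_search(20.0)` returns code 8 after 8 clock cycles, in line with the 4 × 2 cycle count above. `track_step(3, 15, True)` carries into the coarse code and returns `(4, 0)`, which is the behaviour one would expect if the SR and SAR act as a single 8 bit counter. As a side check, if the loop locks to one clock period, a 180 ps intrinsic delay bounds the shortest lockable period, giving 1/180 ps ≈ 5.56 GHz, consistent with the 5.5 GHz simulated maximum.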



Fig. 3 Die microphotograph and test CoB of proposed DLL



Fig. 4 Experimental results

- a Measured locking process at 3.0 GHz
- b Measured input and output clock jitter at 5.0 GHz

Measurement results: The proposed all-digital DLL was implemented in a 65 nm 1.0 V CMOS process and tested in a chip-on-board (CoB) assembly. Fig. 3 shows a die microphotograph and a test CoB of the proposed DLL, which has an active area of 0.025 mm². Fig. 4a shows the measured locking process of the DLL at 3 GHz, taking three cycles for coarse locking and eight cycles for fine locking. The LOCK signal goes high after the third clock cycle, which indicates that the DLL has achieved fast locking, as illustrated in Fig. 1b. An arbitrary waveform generator (Anritsu MP1763C) was used to provide the input reference clock to the test CoB. As shown in Fig. 4b, with a measured peak-to-peak (p-p) input clock jitter of 8 ps, the p-p and the root-mean-square jitter of the output clock at 5.0 GHz are 14 and 1.567 ps, respectively. The effective p-p output clock jitter is ∼6 ps under the assumption that the input jitter is included in the output [3].
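The effective jitter figure follows from a direct subtraction of the measured p-p values. As a worked sketch, with $J_{\rm in}$, $J_{\rm out}$ and $J_{\rm eff}$ as illustrative symbols that do not appear in the Letter, and assuming the input jitter adds linearly to the output:

$$
J_{\rm eff} \approx J_{\rm out} - J_{\rm in} = 14\ \text{ps} - 8\ \text{ps} = 6\ \text{ps}
$$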

The proposed DLL achieves a frequency range of 1.5–5.0 GHz and dissipates 6.9 mW at 5 GHz (= 1.38 mW/GHz). As shown in Table 1, compared with the state-of-the-art multi-GHz digital DLLs, the proposed DLL achieves an almost 2× higher maximum operating frequency. Although the output clock p-p jitter of [4] is the lowest, the maximum operating frequency of [4] is limited to 2.73 GHz. Although [5] achieves fast locking in two cycles, its p-p jitter of 29.4 ps at 1.6 GHz is the highest among the compared designs. The proposed DLL achieves the highest operating frequency among the state-of-the-art digital DLLs, while maintaining a small area, low jitter, and fast locking performance.

Table 1: Performance comparison of state-of-the-art DLLs

| Parameter                    | [3]           | [4]          | [5]             | This work    |
|------------------------------|---------------|--------------|-----------------|--------------|
| Process and supply           | 130 nm/1.5 V  | 55 nm/1.0 V  | 65 nm/1.2 V     | 65 nm/1.0 V  |
| Active area (mm²)            | 0.03          | 0.018        | 0.016           | 0.025        |
| Minimum frequency (GHz)      | 1.5           | 0.1          | 0.3             | 1.5          |
| Maximum frequency (GHz)      | 2.5           | 2.5          | 3.0             | 5.0          |
| Locking time (cycles)        | 24            | 8            | 2               | 11 (= 3 + 8) |
| Input clock p-p jitter (ps)  | NA            | NA           | NA              | 8 at 5 GHz   |
| Output clock p-p jitter (ps) | 14 at 2.5 GHz | 3 at 2.5 GHz | 29.4 at 1.6 GHz | 14 at 5 GHz  |
| Normalised power (mW/GHz)    | 12            | 0.784        | 0.833           | 1.38         |
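The headline comparison figures can be reproduced from the table entries with a few lines of arithmetic. The sketch below copies the maximum-frequency values from Table 1 and the 6.9 mW power figure from the text; the dictionary and function names are hypothetical.

```python
# Maximum operating frequencies in GHz, copied from Table 1.
MAX_FREQ_GHZ = {"[3]": 2.5, "[4]": 2.5, "[5]": 3.0, "this work": 5.0}


def speedup_over(ref: str) -> float:
    """Ratio of this work's maximum frequency to that of a compared design."""
    return MAX_FREQ_GHZ["this work"] / MAX_FREQ_GHZ[ref]


def normalised_power(power_mw: float, freq_ghz: float) -> float:
    """Power per unit operating frequency, in mW/GHz."""
    return power_mw / freq_ghz
```

Here `speedup_over("[3]")` gives 2.0 and `speedup_over("[5]")` gives about 1.67, which is the range behind the "almost 2×" statement, while `normalised_power(6.9, 5.0)` gives about 1.38 mW/GHz.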

Conclusion: A high-speed all-digital DLL for future memory systems beyond DDR4 is presented in this Letter. By utilising a new hybrid search algorithm and a disposable TDC, the DLL achieves fast locking while consuming low power. The proposed TDC-based hybrid search DLL inherently avoids the false lock and harmonic lock problems and achieves high-frequency operation up to 5 GHz by minimising the intrinsic delay of the DCDL. The proposed all-digital DLL can be easily adopted in post-DDR4 SDRAMs that require both an operating frequency of over 3.2 GHz and a fast locking time with a small area, low jitter, and low power dissipation.

Acknowledgments: This work (C0336939) was supported by the Business for Cooperative R&D between Industry, Academy, and Research Institute, funded by the Korea Small and Medium Business Administration in 2015.

© The Institution of Engineering and Technology 2015 Submitted: *15 August 2015* E-first: *30 October 2015* doi: 10.1049/el.2015.2876

One or more of the Figures in this Letter are available in colour online.

Dongyeol Lee and Jongsun Kim (Electronic and Electrical Engineering, Hongik University, 94 Wausan-ro, Mapo-gu, Seoul 121791, Republic of Korea)

✉ E-mail: js.kim@hongik.ac.kr

## References

- 1 Chung, C., and Lee, C.: 'A new DLL-based approach for all-digital multiphase clock generation', *IEEE J. Solid-State Circuits*, 2004, 39, pp. 469–475
- 2 Kang, H., Ryu, K., Jung, D., et al.: 'Process variation tolerant all-digital 90° phase shift DLL for DDR3 interface', *IEEE Trans. Circuits Syst.*, 2012, 59, pp. 2186–2196
- 3 Yang, R., and Liu, S.: 'A 2.5 GHz all-digital delay-locked loop in 0.13 µm CMOS technology', *IEEE J. Solid-State Circuits*, 2007, 42, pp. 2338–2347
- 4 Wang, J., and Cheng, C.: 'An all-digital delay-locked loop using an in-time phase maintenance scheme for low-jitter gigahertz operations', *IEEE Trans. Circuits Syst.*, 2015, 62, pp. 395–404
- 5 Kim, K., Son, S., Ryu, S., et al.: 'A 1.3-mW, 1.6-GHz digital delay-locked loop with two-cycle locking time and dither-free tracking'. IEEE Symp. VLSI Circuits Dig. Tech. Pap., 2013, pp. 158–159
- 6 Han, S., and Kim, J.: 'A 0.1–1.5 GHz all-digital phase inversion delay-locked loop'. IEEE Asian Solid-State Circuits Conf. (A-SSCC) Dig. Tech. Pap., Singapore, November 2013, pp. 341–344